Variational Bayesian inference for the Latent Position Cluster Model for network data

Authors

  • Michael Salter-Townshend
  • Thomas Brendan Murphy
Abstract

Many recent approaches to modelling social networks have focussed on embedding the actors in a latent “social space”. Links are more likely for actors that are close in social space than for actors that are distant in social space. In particular, the Latent Position Cluster Model (LPCM) [1] allows for explicit modelling of the clustering that is exhibited in many network datasets. However, inference for the LPCM via MCMC is cumbersome, and scaling this model to large or even medium-sized networks with many interacting nodes is a challenge. Variational Bayesian methods offer one solution to this problem. An approximate, closed-form posterior is formed, with unknown variational parameters. These parameters are tuned to minimize the Kullback-Leibler divergence between the approximate variational posterior and the true posterior, which is known only up to proportionality. The variational Bayesian approach is shown to give a computationally efficient way of fitting the LPCM. The approach is demonstrated on a number of data sets and is shown to give a good fit.

1 The Latent Position Cluster Model

Handcock et al. [1] developed the Latent Position Cluster Model (LPCM) for social network data. The model involves locating each actor in a latent social space such that actors who are close in social space have a higher probability of forming links than those who are distant in the social space. This model extended the Latent Space Model (LSM) [2] by incorporating a Gaussian mixture model structure for the latent positions of actors in social space, to accommodate the clustering of nodes in the network. Clusters are therefore included explicitly in the model rather than found by post-hoc analysis of the network model. A strength of the latent social space model is that it automatically represents link transitivity. We develop a variational Bayesian inference procedure for approximating the posterior distribution of the parameters in the LPCM. This approach provides computational tools to facilitate the application of the LPCM to larger networks than is currently possible using the existing MCMC methodology for model fitting.

In the LPCM, a binary interaction data matrix Y is modelled using logistic regression, in which the probability of a link between two nodes depends on the distance between the nodes in the latent space:

\text{log-odds}(y_{i,j} = 1 \mid z_i, z_j, \beta) = \beta - |z_i - z_j|    (1)

where \beta is an intercept parameter and |z_i - z_j| is the Euclidean distance between the latent positions of nodes i and j. In addition, the links are assumed to be independent conditional on the latent positions of the actors in the latent space.
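
A point worth spelling out: minimizing the KL divergence to the true posterior is equivalent to maximizing the evidence lower bound (ELBO), because the two differ only by the fixed log marginal likelihood; this is why a posterior known only up to proportionality suffices. As a concrete illustration of equation (1), the following is a minimal sketch, not taken from the paper; it assumes NumPy, a directed network, and illustrative names such as link_probabilities and lpcm_log_likelihood, and evaluates the pairwise link probabilities and the conditional log-likelihood on which a variational (or any other) fitting procedure would build.

    # Illustrative sketch of the LPCM link model in equation (1);
    # all function names here are our own, not the authors'.
    import numpy as np

    def link_probabilities(Z, beta):
        """Pairwise link probabilities: log-odds(y_ij = 1) = beta - |z_i - z_j|."""
        diff = Z[:, None, :] - Z[None, :, :]        # pairwise differences of latent positions
        eta = beta - np.sqrt((diff ** 2).sum(-1))   # log-odds matrix
        return 1.0 / (1.0 + np.exp(-eta))

    def lpcm_log_likelihood(Y, Z, beta):
        """Log-likelihood of a binary adjacency matrix Y given latent positions Z.

        Links are independent conditional on the latent positions, so the
        Bernoulli terms simply sum over ordered pairs i != j."""
        diff = Z[:, None, :] - Z[None, :, :]
        eta = beta - np.sqrt((diff ** 2).sum(-1))
        # Stable Bernoulli log-likelihood per pair: y * eta - log(1 + exp(eta)).
        ll = Y * eta - np.logaddexp(0.0, eta)
        off_diag = ~np.eye(Y.shape[0], dtype=bool)  # exclude self-links
        return ll[off_diag].sum()

    # Example: 10 actors in a 2-dimensional latent space.
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(10, 2))
    Y = (rng.random((10, 10)) < link_probabilities(Z, beta=1.0)).astype(int)
    np.fill_diagonal(Y, 0)
    print(lpcm_log_likelihood(Y, Z, beta=1.0))

The np.logaddexp call computes log(1 + exp(eta)) without overflow for large log-odds; for an undirected network the sum would instead run over unordered pairs i < j.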

Related articles

Latent Dirichlet Bayesian Co-Clustering

Co-clustering has emerged as an important technique for mining contingency data matrices. However, almost all existing co-clustering algorithms are hard partitioning, assigning each row and column of the data matrix to one cluster. Recently, a Bayesian co-clustering approach has been proposed which allows a probability distribution over membership in row and column clusters. The approach uses variatio...

Nonparametric Bayesian Methods for Relational Clustering

An important task in data mining is to identify natural clusters in data. Relational clustering [1], also known as co-clustering for dyadic data, uses information about related objects to help identify the cluster to which an object belongs. For example, words can be used to help cluster documents in which the words occur; conversely, documents can be used to help cluster the words occurring in...

Parameter Estimation for the Latent Dirichlet Allocation

We review three algorithms for parameter estimation of the Latent Dirichlet Allocation model: batch variational Bayesian inference, online variational Bayesian inference, and inference using collapsed Gibbs sampling. We experimentally compare their time complexity and performance. We find that online variational Bayesian inference converges faster than the other two inference techniques, wit...

Algorithms of the LDA model [REPORT]

We review three algorithms for Latent Dirichlet Allocation (LDA). Two of them are variational inference algorithms, Variational Bayesian inference and Online Variational Bayesian inference, and one is a Markov Chain Monte Carlo (MCMC) algorithm, Collapsed Gibbs sampling. We compare their time complexity and performance. We find that online variational Bayesian inference is the fastest algorithm a...

Kernel Implicit Variational Inference

Recent progress in variational inference has paid much attention to the flexibility of variational posteriors. Work has been done to use implicit distributions, i.e., distributions without tractable likelihoods, as the variational posterior. However, existing methods for implicit posteriors still face challenges of noisy estimation and can hardly scale to high-dimensional latent variable models. ...


Journal:
  • Computational Statistics & Data Analysis

Volume 57, Issue –

Pages –

Published: 2013